Skip to content

Drop projections to C++ and add cone program refinement#39

Closed
bamos wants to merge 4 commits into
masterfrom
2020.08.28.bda
Closed

Drop projections to C++ and add cone program refinement#39
bamos wants to merge 4 commits into
masterfrom
2020.08.28.bda

Conversation

@bamos

@bamos bamos commented Aug 28, 2020

Copy link
Copy Markdown
Collaborator

Hi, here's the initial version of dropping the projections to C++ and adding the refinement from this paper. What do you all think?

  • I replaced pi to call into C++ and moved the old Python version to pi_python
  • Everything in tests.py works with the C++ projections
  • I tested the refinement code by comparing the iterates to the numba version here. They consistently match and this is consistently able to improve the results. I don't currently have any examples/tests in here but can add one. The runtime with eigen can be somewhat slow compared to the SCS call but long-term maybe this could be optimized
  • I moved the pre-computations into the derivative and adjoint_derivative functions to better-compare the runtimes here. Something else I noticed is that we can improve the runtime a bit by not returning M back up to Python, so I created _solve_adjoint_derivative_lsqr. We can also do the same thing for the derivative, or can revert this if you all want to keep the pre-computations in solve
  • Running prof.py from this PR on the old code and here shows this PR doesn't slow down anything on this example and gives a slight performance boost on the adjoint derivative from solve_adjoint_derivative_lsqr. In the "old code" I also moved the pre-computations into the derivative and adjoint_derivative for a fair comparison here
forward_time adjoint_time adjoint_derivative_time
Old code: 0.1274257183074951 2.0571622371673586 1.5252753496170044
This PR: 0.1262342929840088 2.0498327732086183 1.100463008880615

@bamos bamos requested review from akshayka and sbarratt August 28, 2020 17:27
Comment thread diffcp/cone_program.py
Comment on lines +294 to +306
Q = sparse.bmat([
[None, A.T, np.expand_dims(c, - 1)],
[-A, None, np.expand_dims(b, -1)],
[-np.expand_dims(c, -1).T, -np.expand_dims(b, -1).T, None]
])

D_proj_dual_cone = _diffcp.dprojection(v, cones_parsed, True)
if mode == "dense":
Q_dense = Q.todense()
M = _diffcp.M_dense(Q_dense, cones_parsed, u, v, w)
MT = M.T

pi_z = pi(z, cones)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The derivative callable could conceivably be called more than once for a single solve; in such cases, wouldn't recomputing Q, D_proj_dual_cone be wasteful (or have I missed something)? I'd prefer pre-computing these quantities and capturing them via lexical closure as before. Otherwise, lazy computation of these quantities would also be fine (i.e., they're computed on the first call to derivative, and re-used thereafter). I have the same comment for adjoint_derivative.

Comment thread diffcp/cone_program.py

pi_z = pi(z, cones)

M = _diffcp.M_operator(Q, cones_parsed, u, v, w)

@akshayka akshayka Aug 31, 2020

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these lines supposed to be in an else clause, accompanying line 301 (they overwrite M and MT from lines 303-34)?

Comment thread diffcp/cones.py
Comment on lines +146 to +157
def pi(x, cones, dual=False):
"""Projects x onto product of cones (or their duals)
Args:
x: NumPy array (with PSD data formatted in SCS convention)
cones: list of (cone name, size)
dual: whether to project onto the dual cone
Returns:
NumPy array that is the projection of `x` onto the (dual) cones
"""

cone_list_cpp = parse_cone_dict_cpp(cones)
return projection(x, cone_list_cpp, dual)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool! might as well delete 'pi_python if it's no longer used.

@akshayka

Copy link
Copy Markdown
Member

Thanks for the PR @bamos! It's great that you've moved pi into C++ ... the less time spent in Python (holding the GIL!), the better :)

I did leave a couple comments --- in particular, I'm not sure I understand the motivation for moving the pre-computations for derivative and adjoint_derivative into the returned callables.

I haven't had a chance to review the refinement code. What do you think about splitting this PR into two separate ones --- one that moves pi to C++ (which will be simple for us to review) and one that adds the refinement (which will take longer for us to review).

@bamos

bamos commented Sep 2, 2020

Copy link
Copy Markdown
Collaborator Author

I haven't had a chance to review the refinement code. What do you think about splitting this PR into two separate ones --- one that moves pi to C++ (which will be simple for us to review) and one that adds the refinement (which will take longer for us to review).

Yes, sounds good. I'll close this PR and send in separate PRs for the projections and refinement separately

I did leave a couple comments --- in particular, I'm not sure I understand the motivation for moving the pre-computations for derivative and adjoint_derivative into the returned callables.

I noticed that not returning M back up to Python at all (and calling into the _solve_adjoint_derivative_lsqr I've added here) is significantly faster than returning M and then passing that back down to C++. I moved the other pre-computations just for consistency, but I'll keep this part out of my new PRs. It's still worth looking closer into this --- I think it may also be a bit faster to never construct M and just pass around A,b,c.

@bamos bamos closed this Sep 2, 2020
@bamos bamos mentioned this pull request Sep 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants